Using Conflicts Among Multiple Base Classifiers to Measure the Performance of Stacking
Abstract
We analyze the machine learning bias of stacking and identify the conflict problem. A conflict occurs when base data with different class labels receive the same predictions from a set of base classifiers. Based on conflicts, we propose a conflict-based accuracy estimate to determine the overall accuracy of a stacked classifier and a conflict-based accuracy improvement estimate to determine its overall accuracy improvement over the base classifiers. We discuss some popular metrics for comparing and evaluating a set of classifiers (coverage, correlated error, diversity, and specialty) and show that these metrics do not accurately estimate the overall accuracy of a stacked classifier system. Experimental results demonstrate that the conflict-based accuracy estimate is an effective measure for predicting overall performance and comparing different stacked systems, and that the conflict-based accuracy improvement estimate is a good measure for projecting the overall accuracy improvement.
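The definition of a conflict can be illustrated with a small sketch. The abstract does not give the formula for the conflict-based accuracy estimate, so the code below rests on an assumption: a meta-classifier that sees only the base predictions must assign one label to every example sharing the same prediction tuple, so at best it gets the majority label of each group right. The function name `conflict_based_accuracy` and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def conflict_based_accuracy(base_predictions, labels):
    """Return (accuracy_estimate, conflict_count) for a stacked system.

    Assumption (not from the paper): examples that share the same tuple of
    base-classifier predictions but carry different true labels are in
    conflict, and a meta-classifier can at best predict the majority label
    of each such group.
    """
    if len(base_predictions) != len(labels) or not labels:
        raise ValueError("need equally sized, non-empty inputs")

    # Group the true labels by the tuple of base-classifier predictions.
    groups = defaultdict(Counter)
    for preds, label in zip(base_predictions, labels):
        groups[tuple(preds)][label] += 1

    # Within each group only the majority label can be predicted correctly.
    correct = sum(max(counts.values()) for counts in groups.values())
    conflicts = len(labels) - correct
    return correct / len(labels), conflicts


# Toy data: each row holds the predictions of two hypothetical base classifiers.
base_predictions = [
    ("a", "a"), ("a", "a"), ("a", "a"),
    ("a", "b"), ("a", "b"),
    ("b", "b"),
]
labels = ["a", "a", "b", "a", "b", "b"]

estimate, conflicts = conflict_based_accuracy(base_predictions, labels)
print(f"estimate={estimate:.3f}, conflicts={conflicts}")
```

In this toy data, the groups `("a", "a")` and `("a", "b")` each contain one example whose label disagrees with the group majority, so two examples are unavoidably misclassified under the stated assumption.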
Similar resources
Fault Detection of Bearings Using a Rule-based Classifier Ensemble and Genetic Algorithm
This paper proposes a reduct construction method based on discernibility matrix simplification. The method works with genetic algorithm. To identify potential problems and prevent complete failure of bearings, a new method based on rule-based classifier ensemble is presented. Genetic algorithm is used for feature reduction. The generated rules of the reducts are used to build the candidate base...
Application of ensemble learning techniques to model the atmospheric concentration of SO2
In view of pollution prediction modeling, the study adopts homogeneous (random forest, bagging, and additive regression) and heterogeneous (voting) ensemble classifiers to predict the atmospheric concentration of sulphur dioxide. For model validation, results were compared against widely known single base classifiers such as support vector machine, multilayer perceptron, linear regression and re...
A new method for extracting named entities in Classical Arabic
In Natural Language Processing (NLP) studies, developing resources and tools contributes to the extension and effectiveness of research in each language. In recent years, Arabic Named Entity Recognition (ANER) has drawn attention from NLP researchers because of its significant impact on other NLP tasks such as machine translation, information retrieval, question answering, query result...
Classifier Subset Selection for the Stacked Generalization Method Applied to Emotion Recognition in Speech
In this paper, a new supervised classification paradigm, called classifier subset selection for stacked generalization (CSS stacking), is presented to deal with speech emotion recognition. The new approach consists of an improvement of a bi-level multi-classifier system known as stacking generalization by means of an integration of an estimation of distribution algorithm (EDA) in the first laye...
Troika - An improved stacking schema for classification tasks
The idea of ensemble methodology is to build a predictive model by integrating multiple models. It is well-known that ensemble methods can be used for improving prediction performance. Researchers from various disciplines such as statistics, machine learning, pattern recognition, and data mining have considered the use of ensemble methodology. Stacking is a general ensemble method in which a nu...
Publication date: 1999